Kai Yu

Sherman

ProductWebGen: Benchmarking Multimodal Product Webpage Generation

May 31, 2026
Via arXiv

OpenSTBench: Beyond Semantic Evaluation for Speech Translation

May 29, 2026
Via arXiv

A Unified and Reproducible Experimentation Framework for Speech Understanding

May 29, 2026
Via arXiv

HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding

May 28, 2026
Via arXiv

DeepSurvey: Enhancing Analytical Depth and Citation Reliability in Automated Survey Generation

May 28, 2026
Via arXiv

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

May 28, 2026
Via arXiv

Audio-Mind: An Auditable Agentic Framework for Audio Understanding

May 27, 2026
Via arXiv

Artificial Intelligence-Assistant Cardiotocography: Unified Model for Signal Reconstruction, Fetal Heart Rate Analysis, and Variability Assessment

May 14, 2026
Via arXiv

Good to Go: The LOOP Skill Engine That Hits 99% Success and Slashes Token Usage by 99% via One-Shot Recording and Deterministic Replay

May 14, 2026
Via arXiv

No Action Without a NOD: A Heterogeneous Multi-Agent Architecture for Reliable Service Agents

May 12, 2026
Via arXiv